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Semi-parametric time series modelling with 
autocopulas 


Antony Ware, Ilnaz Asadzadeh 


Abstract In this paper we present an application of the use of autocopulas for mod¬ 
elling financial time series showing serial dependencies that are not necessarily lin¬ 
ear. The approach presented here is semi-parametric in that it is characterized by a 
non-parametric autocopula and parametric marginals. One advantage of using auto¬ 
copulas is that they provide a general representation of the auto-dependency of the 
time series, in particular making it possible to study the interdependence of values 
of the series at different extremes separately. The specific time series that is studied 
here comes from daily cash flows involving the product of daily natural gas price 
and daily temperature deviations from normal levels. Seasonality is captured by us¬ 
ing a time dependent normal inverse Gaussian (NIG) distribution fitted to the raw 
values. 


1 Introduction 


In this study, autocopulas are used to characterise the joint distribution between 
successive observations of a scalar Markov chain. A copula joins a multivariate dis¬ 


tribution to its marginals, and its existence is guaranteed by Sklar’s theorem Sklar 
|1959| . In particular, a Markov chain of first order with any given univariate margin 
can be constructed from a bivariate copula. A theoretical framework for the use of 
copulas for simulating time series was given by Darsow et al.j [1992| , who presented 
necessary and sufficient conditions for a copula-based time series to be a Markov 
process, but not necessarily a stationary one. They presented theorems specifying 
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when time series generated using time varying marginal distributions and copulas 
are Markov processes. Joe] (1997} proposed a class of parametric stationary Markov 


models based on parametric copulas and parametric marginal distributions. Chen 


and Fan [ |2006| studied the estimation of semiparametric stationary Markov mod¬ 


els, using non-parametric marginal distributions with parametric copulas to generate 
stationary Markov processes. 

The term ’autocopula’ was first used to describe the unit lag self dependence 
structure of a univariate time series in |Rakonczai et al.| |2012|, and we ado pt the 
terminology here. We make use of the framework presented in Darso w et al.| fl992] 
to produce Markov processes such that the marginal distribution changes over time. 
The main benefit of using autocopulas for univariate time series modelling is that 
the researcher is able to specify the unconditional (marginal) distribution of X t sep¬ 
arately from the time series dependence of X t ( Patton |2009| ). We apply a semipara¬ 
metric method which is characterized by an empirical autocopula and a parametric 
time varying marginal distribution. This allows us to capture seasonal variations 
in a natural way. This is an important feature of our model, motivated by the fact 
that many financial and economic time series exhibit seasonality, particularly those 
arising from energy and commodity markets. 

The remainder of this paper is organized as follows. In Section [2| we introduce 
the data; Section [3] describes the model, including a review of copulas, and of the 
Normal Inverse Gaussian distribution. This section also includes details of the cali¬ 
bration and simulation procedures, and the final section presents some results. 


2 The data 


The motivation from this project came from the desire to develop a parsimonious 
model that could capture so-called load-following (or swing) risk. This is one of 
the main sources of financial uncertainty for an energy retailer, and arises from the 
combination of retail customer consumption (volume) uncertainty and price uncer¬ 
tainty. Both volume (V) and price ( P ) are driven to a large extent by weather. In 
particular, average daily temperature is one of the main drivers of daily natural gas 
consumption in various North American markets: this in turn drives market prices 
through a supply and demand process. 

Some of the load-following risk exposure can be hedged using gas forwards and 
temperature derivatives. The most significant part that cannot be easily hedged is 
directly linked to the daily product between the weather deviation from normal and 
the daily price deviation from the expected value of the ex-ante forward price. To 
make this more specific, let P denote the last-traded forward monthly index price, 
and V the expected monthly average volume. Cash flows for the retailer depend on 
the product PV, and the uncertainty in this quantity can be written 

(P + AP)(V + AV) — PV = PAV + VAP + APAV. 





















Semi-parametric time series modelling with autocopulas 


3 


We posit a linear relationship between volume and weather deviations, so that 
AV = PAW + £, where /3 is the sensitively of consumption to weather, which can 
be determined from load data for different regions. Forward instruments in weather 
and natural gas markets can then be used to hedge risks corresponding to the terms 
PAV and VAP. Apart from the error term £ in the volume-weather relationship, 
which we assume to be relatively small, it can be seen that the term A PAW be¬ 
comes the main driver of unhedged risk in these cashflows. One approach would 
be to develop separate models for weather and natural gas prices (both daily and 
forward prices). However, because of the desire for parsimony, we instead seek a 
model that allows us to study the time-series X t = ( APAW ) t , in order to estimate 
the range and probabilities of possible outcomes at the level of a complex portfolio 
of retail load obligations. 

Here we study the North American market and focus on the Algonquin location 
for the weather data. The data cover the period 1 January 2003 - 31 June 2014 on 
a daily basis, and are shown in Figure [T] The most dramatic feature of the graph 
is the presence of intermittent clusters of spikes, during which the gas prices rise 
from their approximate average daily value and at the same time temperature rises 
or falls drastically. These mostly occur during winter, although large deviations also 
occur at other times of the year. It is also clear that the marginal densities of these 
observations will not be well-represented by normal distributions. 


Fig. 1 Product of weather and 
gas price deviations (APAW) 
in Algonquin over 2003- 
14. Spikes correspond to 
combinations of high weather 
deviation from normal and 
high spot price deviation from 
next forward month. 



3 The model 


Here we introduce the simulation model in more detail, providing a brief review 
of copulas, and the normal inverse Gaussian distribution, which we use for the 
marginal densities. 
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3.1 Copulas and autocopulas 


A copufiQis a multivariate distribution function defined on a unit cube [0,1]", with 
uniformly distributed marginals. In the following, we use copulas for the interdepen¬ 
dence structure of time series and, for simplicity and the fact that we are interested 
in the first order lag interdependence, we focus on the bivariate case, although the 
approach can be used to capture dependence on higher order lags. 

Let F\ 2 (x,y) be the joint distribution function of random variables X and Y 
whose marginal distribution functions, denoted as F\ and respectively, are contin¬ 
uous. Sklar’s theorem specifies that there exists a unique copula function C(w, v) = 
F n (F-\u),F^\v)) that connects F\ 2 {x,y) to F\ (pc) and /^(y) via F\ 2 (jc,y) = 
C{F\(x),F 2 (y )). The information in the joint distribution F\ 2 (x,y) is decomposed 
into that in the marginal distributions and that in the copula function, where the 
copula captures the dependence structure between X and Y. Various families of 
parametric copulas are widely used (Gaussian, Clayton, Joe, Gumbel copulas, for 
example). 

In a time series setting, we use a copula (or autocopula ) to capture the depen¬ 
dence structure between successive observations. More generally, we have the fol¬ 
lowing definition ([Rakonczai et al. |2012[ ). 


Definition 1 (Autocopula). Given a time series X t and ££ = {/*• E Z + ,z = 1,..., d} 
a set of lags, the autocopula Cx is defined as the copula of the d + 1 dimensional 
random vector (X t , X t -i x ,..., X t -i d ). 

If a times series X t is modelled with an autocopula model with unit lag, with 
autocopula function C(u, v) = Cx ,i (w, v), and (time-dependent) marginal CDF F t (x ), 
then, for each t, the CDF of the conditional density of X t given X t _\ can be expressed 


dC 




( 1 ) 


We will discuss issues related to calibration and simulation below. 

Autocopula models include many familiar time series as special cases. For ex¬ 
ample, it is straightforward to show that an AR(1) process, y t = ocy t -\ + j3 + ce(t), 
can be modelled using the autocopula framework using the marginal distribution 

ZzM i-°0 


Foo(y)**4> 


y/o 2 /(l-a 2 ) 


(where <P denotes the standard normal CDF) and a Gaus- 


1 a 
a 1 


sian copula with mean p = j3/(l — a) and covariance 

Part of the motivation for the use of autocopulas in time series modelling is that, 
while correlation coefficients measure the general strength of dependence, they pro¬ 
vide no information about how the strength of dependence may change across the 
distribution. For instance, in the dataset we consider here there is evidence of tail 
dependence , whereby correlation is higher near the tails of the distribution. We can 
quantify this using the following definition (Joe |1997| , Section 2.1.10). 


1 For more discussion on the theory of copulas and specific examples, see Nelsen 2007 . 
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Definition 2 (Upper and Lower Tail Dependence). If a bivariate copula C is such 
that lim M _>iC(M,M)/(l —u) = Xjj exists, where C(w,«) = 1 — C(l,w) — C(w, 1) + 
C(w, w), then C has upper tail dependence if At/ E (0,1] and no upper tail dependence 
if A u = 0. Similarly, if lim M ^oL’(w, u) /(m) = Al exists, C has lower tail dependence 
if A^ E (0,1] and no lower tail dependence if A/, = 0 

In Figure[2]we show estimates of the quantities C(w, u) / (m) and C(w, w)/(1 — m), 
where here we use the order statistics of the time series X t = ( APAW) t to generate 
a preliminary empirical proxy for the copula function C. It is clear from the figure 
that neither set of values tends towards zero in the limit u 0 or u 1, and we 


conclude that the data exhibit nonzero tail dependence. 


Fig. 2 Estimated values of 

1i 



the quantities C(u,u)/(u) and 
C(m,m)/(1— u ) showing lower 
and upper tail dependence 
in the observed values of 

+ + , l 

+ + 
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3.2 Time Varying Marginal Distribution 


As noted above, the marginal densities for our time series will not be normal. We 
found that the normal inverse Gaussian (NIG) distribution provided a more satisfac¬ 
tory fit. More information about this distribution and its applications can be found 
Barndorff-Nielsen et al. [2012]. Here we review its definition and properties. 


in 


3.2.1 Definition and properties of the NIG distribution 


A non-negative random variable Y has an inverse Gaussian distribution with param¬ 
eters a > 0 and j3 > 0 if its density function is of the form 


fia(y, cc.fi) 


“ y — 3/2 

6 


exp 


( 


(a-fty) 2 \ 

2 Py r 


for y > 0. 


A random variable X has an NIG distribution with parameters a, j 3, ju and 8 if 
X\Y = y ~ N(fi +13y,y) and Y ~ IG(Sy, y 2 ), 
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with 7 := y/a 2 —jS 2 , 0 < |/J| < a and 5 > 0. We then write X ~ NIG(a,/3,/i, 5). 
Denoting by K\ the modified Bessel function of the second kind, the density is given 
by 




5 a exp (5y+j3(;t — ju)) 
7T\/<5 2 + (.x:-jU) 2 1 


(a^/5 2 + (x-ju) 2 ). 


There is a one-to-one map between the parameters of the NIG distribution and the 
mean, variance, skewness and kurtosis of the data. We first use moment matching to 
determine initial estimates for the parameters; we then use these values as our initial 
estimates in a MLE estimation. 

Table [T] shows the estimated parameters of the NIG distribution—assuming that 
the distribution is invariant over time. The corresponding fit to the data is shown 
in Figure [3] where the best fitting normal density is also shown. It can be seen 
that the NIG fit is quite good. However, it is evident from Figure [T] that the time 
series is strongly seasonal. We seek to capture this seasonality through the marginal 
densities by making the parameter 5 of the NIG distribution time-dependent. This 
was achieved by assuming 5 to be constant in each month, and maximizing the 
resulting joint likelihood across the entire data set. The results are shown in Figure[4] 
and the seasonal pattern that is evident in the original data is evident again here. 


Table 1: Results of non time-dependent NIG estimation 


Id a (5 5 

Moment Matching 0.3244 0.0231 0.0210 2.7129 
MLE 0.0980 0.0131 0.0122 2.3799 


Fig. 3 Histogram of observed 
data (APAW), with fitted 
normal distribution and NIG 
distribution 



Fig. 4 Calibrated monthly 
values of 5 from the combined 
NIG likelihood 
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As can be seen in Figure|4j the value of 8 tends to be higher in winter and lower in 
summer. The time series of values appears to be mean reverting with seasonal mean 
and variance. We model the time series using a seasonal mean reverting process for 

Vf = Vs t : 

v,+i = av, + b(t) + a(t)z ,+1 • (2) 

The mean and variance are estimated using periodic functions with periods from 
one year down to three months. 

Simulated and estimated values of 8 t are shown in Figure [5] 20,000 paths were 
simulated using ([2j, and for each month the set of values was used to determine 
quantiles, which were then used to create the coloured patches shown in the figure. 
The darker patches correspond to quantiles nearer to the centre of the distribution, 
and the lighter patches to quantiles nearer the extremes. 


Fig. 5 Calibrated monthly 
values of 8 , together with an 
example of a simulated path, 
as well as a colour contour 
plot of the quantiles from a 
large number of simulated 
paths. 



2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 


Once we have values of 8 t , we can obtain the time varying cumulative distribu¬ 
tion function and time varying density function. The NIG cumulative distribution 
function does not have a closed form solution, so we can compute the CDF using 
Gaussian quadrature to evaluate the following integral. 

F(x t -,oc, p,n, 8) = £j mG (X t ;a,p,n,8 t )dX t (3) 

In next section we explain the procedure to calculate the empirical autocopulas and 
simulate cash flows. 


3.3 Estimating the Empirical Autocopula 


Having estimated the time-dependent NIG densities, we use these to produce a time 
series of values V t =F t (X t ) G [0,1]. If the marginal densities were exact, these would 
be uniformly distributed on [0,1]. In practice, they will only be approximately uni¬ 
form, and we generate an additional empirical marginal density and an empirical 
(auto)copula to capture the joint density of (V t ,V t - 1 ). 
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The empirical autocopula C is estimated by first estimating an empirical joint 
density for (V t ,V t - i) in the form of a strictly increasing continuous function <&(-,-) 
that is piecewise bilinear. The domain [0, l] 2 is partitioned into rectangles contain¬ 
ing approximately similar numbers of samples (V*_ \,V t ), and taking <P to be the 
cumulative integral of the sum of indicator functions for these rectangles, scaled by 
the number of samples in each rectangle. <£> is then used to create strictly increasing 
piecewise linear marginal densities <P\ and <J> 2 - The inverses of these densities are 
therefore also piecewise linear, and when composed with <P they generate a piece- 
wise bilinear copula function C{u\,U 2 ) = ^^~ 1 (m2 ))- 

This process is illustrated in Figure [6] In Figure [6a] we plot the pairs of trans¬ 
formed values ( < Pi(y t -i),<& 2 (Vt)), together with the outlines of rectangles used to 
generate the piecewise bilinear function C. As mentioned, these rectangles contain 
roughly equal numbers of points; constructing the empirical autocopula in this way 
ensures that it is strictly increasing, and well-suited to enable the computations in¬ 
volved in time series simulation (see below) to be carried out efficiently. 

The resulting empirical autocopula C is shown in Figure [6b] This function is 
binlinear on each of the rectangles shown in Figure [6a] but is less regular than it 


looks. The corresponding joint density, ( w i > u i)* is shown in Figure 

be seen that the density is higher near (0,0) and near (1,1), which is consistent with 
the tail dependency observed earlier. 


6c 


It can 



0 1 


(a) Scatter plot of <&i(V t -i) 
against &2 (Vt). Each rectangle 
contains about the same num¬ 
ber of points. 



0 0 


(b) Empirical autocopula 
C(u\,U 2 ) defined to be bilin¬ 
ear on each of the rectangles 
shown in (a). 



0 0 


(c) The empirical density 
d 2 C/du\du 2 , which is con¬ 
stant on each of the rectangles 
shown in (a). 


Fig. 6: Generation of the empirical autocopula 


3.4 Simulation of time series using autocopula 

Armed with the time-dependent NIG densities iy(-), the empirical marginal densi¬ 
ties Fyf-) and the empirical autocopula C( •,•), we can generate simulated values x t 
as follows. 

1. Given an initial value xq, generate vo = Fq(xo). 

2. Fort = 0,1,..., given v t , generate v t +\: 
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a. Set u\ = <Pi (v t ). 

b. Given u\ , create the piecewise linear function C(u) C(u\m)/u \. 

c. Set U 2 = C~ l (U), where U is an independent uniform random draw. 

d. Set v t +\ = 0 2 “ 1 (w 2 )- 

3. For each t > 0, set x t = F t ~ l (v t ). 

Here we have used the fact (already alluded to in (p}) that, if U\ and U 2 are uniform 
random variables whose joint distribution is the copula C(mi,m 2 ), then, for u\ > 0, 
the cumulative density function for t/ 2 , conditional on U\ = u\, is 

F[f/ 2 < ll 2 |C/i = Ml] = ^-(mi,m 2 ) = C ( Ul ' U2 \ 

OU2 U\ 

The proof of this can be found in, for example, Darsow et al. |1992[ . 

The fact that C is a piecewise bilinear function means that C will be piecewise lin¬ 
ear. Moreover, the construction of the empirical copula as described in Section [33] 
ensures that it is an increasing function with a limited number of comers. Its in¬ 
verse can then be constructed readily, and will also be an increasing piecewise linear 
function with a limited number of corners, and so can be evaluated with little com¬ 
putational effort. Indeed, in practice the computation of the final step in the above 
algorithm, the inversion of the time-dependent NIG densities, took more time than 
the copula-related computations. 


4 Results 

In Figure [7]we show a 12-year sample time series for A PAW computed as described 
in Section~ |T4| In addition, we simulated around 700 independent time series and 
computed, for each month, the 99th percentile of values produced in that month 
across all simulations. 

What can be seen in the sample path is the same mixture of quiescent periods 
and periods with extremely large deviations from zero. There is some evidence of 
‘clumps’ of large deviations occuring in winter months, although this is less clear 
than in the original data (see Figure [l}. There is, nevertheless, an increased oc¬ 
curence of large deviations in winter months, as can be seen from the plot of the 
99th percentiles that is superimposed on the sample simulation shown in Figure [7] 

In Figure [ 8 ] we illustrate the fact that the simulations have reproduced the tail 
dependence that was evident in the time series of original observations. The data 
from Figure [2] is reproduced, together with error bars corresponding to the 5th and 
95th percentiles of the values obtained from the simulations. 
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Fig. 7 Simulated values of 
A PAW, together with the 
99th percentile of collected 
monthly values from around 
700 simulations. 
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Fig. 8 Estimated values of 
the quantities C(u,u)/ ( u ) and 
C(u, u)/(l — u) for the orig¬ 
inal observations of A PAW. 
Also shown are error bars 
corresponding to the 5th and 
95th percentiles of the values 
obtained from around 700 
simulations. 
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